1 Document image understanding 
Sargur N. Srihari 

November 1986 Proceedings of 1986 ACM Fall joint computer conference 
Publisher: IEEE Computer Society Press 

Full text available: pdf(1.38 MB) Additional Information: full citation, references, citings, index terms 
... purposing and quality assurance 
Steven J. Simske, Margaret Sturgill 

November 2003 Proceedings of the 2003 ACM symposium on Document engineering 
Publisher: ACM Press 

Full text available: pdf(165.66 KB) Additional Information: full citation, abstract, references, index terms 

We present design strategies, implementation preferences and throughput results 
obtained in deploying a UI-based ground truthing engine as the last step in the quality 
assurance (QA) for the conversion of a large out-of-print book collection into digital form. 
A series of automated QA steps were first performed on the document. Five distinct 
zoning analysis options were deployed and the PDF output thence generated was used to 
regenerate TIFF files for comparison to the originals. Regenerated TIF ... 
Jean Duong, Myriam Cote, Hubert Emptoz, Ching Y. Suen 

November 2001 Proceedings of the 2001 ACM Symposium on Document engineering 
Publisher: ACM Press 

Full text available: pdf(1.06 MB) Additional Information: full citation, abstract, references, index terms 

In this paper, we present a document analysis system which is expected to extract 
regions of interest in greyscale document images. Collected areas are then clustered in 
text zones and non-text areas using geometric and texture features. The system works in 
two steps. Regions of interest are retrieved via cumulative gradient considerations. In 
classification module, we introduced some entropic heuristic. Experiments are done on the 
MediaTeam Document Database to show the relevance of this criter ... 